
SEQUENCE LISTING 



<110> Suntory Limited 

<120> Homeobox gene coding for a protein involved in 
differentiation 

<130> H773 

<160> 8 

<210> 1 

<211> 1214 

<212> DNA 

<213> Arabidopsis thaliana 

<221> CDS 

<222> (36)...(1010)

<223> Nucleotide sequence coding for a protein involved 
in differentiation 

<400> 1 



ctttagctct cgattatcat cattacacca tcatc atg tcc tcc tca aac aaa  53
                                       Met Ser Ser Ser Asn Lys
                                       1               5

aat tgg cca agc atg ttc aaa tcc aaa cct tgc aac aat aat cat cat  101
Asn Trp Pro Ser Met Phe Lys Ser Lys Pro Cys Asn Asn Asn His His
            10                  15                  20

cat caa cat gaa atc gat act cca tct tac atg cac tac tct aat tgc  149
His Gln His Glu Ile Asp Thr Pro Ser Tyr Met His Tyr Ser Asn Cys
        25                  30                  35

aac cta tca tct tcc ttt tcc tca gat cgg ata cca gat cct aaa ccg  197
Asn Leu Ser Ser Ser Phe Ser Ser Asp Arg Ile Pro Asp Pro Lys Pro
    40                  45                  50

aga tgg aat cct aaa ccg gag cag att agg ata ctc gaa tca atc ttc  245
Arg Trp Asn Pro Lys Pro Glu Gln Ile Arg Ile Leu Glu Ser Ile Phe
55                  60                  65                  70

aat tcc ggt act att aac cca cct aga gag gag att caa aga atc cgg  293
Asn Ser Gly Thr Ile Asn Pro Pro Arg Glu Glu Ile Gln Arg Ile Arg
                75                  80                  85

atc cgg ctt caa gaa tat ggt caa atc ggt gac gca aac gtg ttt tac  341
Ile Arg Leu Gln Glu Tyr Gly Gln Ile Gly Asp Ala Asn Val Phe Tyr
            90                  95                  100

tgg ttt caa aac cgg aaa tct cga gca aaa cac aag ctt cgt gtt cat  389
Trp Phe Gln Asn Arg Lys Ser Arg Ala Lys His Lys Leu Arg Val His
        105                 110                 115

cac aaa agc cct aaa atg tca aag aag gac aag acg gtt att cct agt  437
His Lys Ser Pro Lys Met Ser Lys Lys Asp Lys Thr Val Ile Pro Ser
    120                 125                 130

act gac gct gat cat tgt ttt ggt ttt gtt aac caa gaa acc gga tta  485
Thr Asp Ala Asp His Cys Phe Gly Phe Val Asn Gln Glu Thr Gly Leu
135                 140                 145                 150

tat ccg gtt caa aac aat gag ttg gtg gta acc gaa ccg gcc ggt ttt  533
Tyr Pro Val Gln Asn Asn Glu Leu Val Val Thr Glu Pro Ala Gly Phe
                155                 160                 165

cta ttt ccg gtt cat aat gat ccg agc gct gct caa tca gcg ttt ggt  581
Leu Phe Pro Val His Asn Asp Pro Ser Ala Ala Gln Ser Ala Phe Gly
            170                 175                 180

ttt ggc gat ttt gtt gta ccg gtg gta acg gaa gaa ggg atg gca ttc  629
Phe Gly Asp Phe Val Val Pro Val Val Thr Glu Glu Gly Met Ala Phe
        185                 190                 195

tct acc gtt aat aac ggc gtt aat ttg gag act aac gaa aat ttt gat  677
Ser Thr Val Asn Asn Gly Val Asn Leu Glu Thr Asn Glu Asn Phe Asp
    200                 205                 210

aaa att ccg gcg atc aat tta tac ggc gga gat gga aat ggc ggt gga  725
Lys Ile Pro Ala Ile Asn Leu Tyr Gly Gly Asp Gly Asn Gly Gly Gly
215                 220                 225                 230

aat tgt ttt cct cct ttg act gtt cca tta acc atc aat caa tct caa  773
Asn Cys Phe Pro Pro Leu Thr Val Pro Leu Thr Ile Asn Gln Ser Gln
                235                 240                 245

gaa aaa cga gat gta gga tta tcc ggt ggt gaa gac gtc gga gat aat  821
Glu Lys Arg Asp Val Gly Leu Ser Gly Gly Glu Asp Val Gly Asp Asn
            250                 255                 260

gtt tat ccg gtg aga atg acg gtg ttt att aac gag atg cct atc gaa  869
Val Tyr Pro Val Arg Met Thr Val Phe Ile Asn Glu Met Pro Ile Glu
        265                 270                 275

gta gtg tct gga tta ttc aac gtt aag gca gct ttc gga aac gat gcc  917
Val Val Ser Gly Leu Phe Asn Val Lys Ala Ala Phe Gly Asn Asp Ala
    280                 285                 290

gtt ttg atc aac tcg ttt ggc cag cct att ctt aca gat gaa ttt ggt  965
Val Leu Ile Asn Ser Phe Gly Gln Pro Ile Leu Thr Asp Glu Phe Gly
295                 300                 305                 310

gtt act tat caa cct ctc caa aat ggc gca atc tat tat ctt att  1010
Val Thr Tyr Gln Pro Leu Gln Asn Gly Ala Ile Tyr Tyr Leu Ile
                315                 320                 325

tagaagatat tgaaaagcaa atgttatggt gctatggata aatattaata taataataaa  1070
agatttctgc gatttattta gttattaatt agataagaat ttcatttctt atcttttaaa  1130
tttatgaaca atttacagga catttacatt ttcgagactt tgaaaaataa agaatgaaat  1190
taagttaaaa aaaaaaaaaa aaaa  1214

<210> 2 
<211> 325 
<212> PRT 

<213> Arabidopsis thaliana 

<223> Amino acid sequence of protein involved in 

differentiation 

<400> 2 

Met Ser Ser Ser Asn Lys Asn Trp Pro Ser Met Phe Lys Ser Lys Pro
1               5                   10                  15

Cys Asn Asn Asn His His His Gln His Glu Ile Asp Thr Pro Ser Tyr
            20                  25                  30

Met His Tyr Ser Asn Cys Asn Leu Ser Ser Ser Phe Ser Ser Asp Arg
        35                  40                  45

Ile Pro Asp Pro Lys Pro Arg Trp Asn Pro Lys Pro Glu Gln Ile Arg
    50                  55                  60

Ile Leu Glu Ser Ile Phe Asn Ser Gly Thr Ile Asn Pro Pro Arg Glu
65                  70                  75                  80

Glu Ile Gln Arg Ile Arg Ile Arg Leu Gln Glu Tyr Gly Gln Ile Gly
                85                  90                  95

Asp Ala Asn Val Phe Tyr Trp Phe Gln Asn Arg Lys Ser Arg Ala Lys
            100                 105                 110

His Lys Leu Arg Val His His Lys Ser Pro Lys Met Ser Lys Lys Asp
        115                 120                 125

Lys Thr Val Ile Pro Ser Thr Asp Ala Asp His Cys Phe Gly Phe Val
    130                 135                 140

Asn Gln Glu Thr Gly Leu Tyr Pro Val Gln Asn Asn Glu Leu Val Val
145                 150                 155                 160

Thr Glu Pro Ala Gly Phe Leu Phe Pro Val His Asn Asp Pro Ser Ala
                165                 170                 175

Ala Gln Ser Ala Phe Gly Phe Gly Asp Phe Val Val Pro Val Val Thr
            180                 185                 190

Glu Glu Gly Met Ala Phe Ser Thr Val Asn Asn Gly Val Asn Leu Glu
        195                 200                 205

Thr Asn Glu Asn Phe Asp Lys Ile Pro Ala Ile Asn Leu Tyr Gly Gly
    210                 215                 220

Asp Gly Asn Gly Gly Gly Asn Cys Phe Pro Pro Leu Thr Val Pro Leu
225                 230                 235                 240

Thr Ile Asn Gln Ser Gln Glu Lys Arg Asp Val Gly Leu Ser Gly Gly
                245                 250                 255

Glu Asp Val Gly Asp Asn Val Tyr Pro Val Arg Met Thr Val Phe Ile
            260                 265                 270

Asn Glu Met Pro Ile Glu Val Val Ser Gly Leu Phe Asn Val Lys Ala
        275                 280                 285

Ala Phe Gly Asn Asp Ala Val Leu Ile Asn Ser Phe Gly Gln Pro Ile
    290                 295                 300

Leu Thr Asp Glu Phe Gly Val Thr Tyr Gln Pro Leu Gln Asn Gly Ala
305                 310                 315                 320

Ile Tyr Tyr Leu Ile
                325

<210> 3 
<211> 1518 
<212> DNA 

<213> Arabidopsis thaliana 

<221> CDS 

<222> (152)...(1285)

<223> Nucleotide sequence coding for a protein involved
in differentiation

<400> 3 

tttttattta tctttccttt agccattctg ttccctgtct cttcctcctc tctttttgac 60 
acatcacatc atcatcacat catcattcaa catcaatcat catcatatgc atacacatac 120 
atctgtgttc tgcggatcga gttaattagt t atg gct tct tcg aat aga cac  172
                                   Met Ala Ser Ser Asn Arg His
                                   1               5

tgg cca agc atg ttc aag tcc aaa cct cat ccc cat caa tgg caa cat  220
Trp Pro Ser Met Phe Lys Ser Lys Pro His Pro His Gln Trp Gln His
        10                  15                  20

gac atc aac tct cct ctc ttg cct tct gct tct cac cga tct tct cct  268
Asp Ile Asn Ser Pro Leu Leu Pro Ser Ala Ser His Arg Ser Ser Pro
    25                  30                  35

ttc tct tca gga tgt gaa gtg gag agg agt cca gag cca aaa cca aga  316
Phe Ser Ser Gly Cys Glu Val Glu Arg Ser Pro Glu Pro Lys Pro Arg
40                  45                  50                  55

tgg aat cca aag cca gag cag att cgg ata ctt gaa gca atc ttt aac  364
Trp Asn Pro Lys Pro Glu Gln Ile Arg Ile Leu Glu Ala Ile Phe Asn
                60                  65                  70

tcc ggg atg gtg aat cct cca aga gag gag atc agg agg att agg gct  412
Ser Gly Met Val Asn Pro Pro Arg Glu Glu Ile Arg Arg Ile Arg Ala
            75                  80                  85

cag ctt caa gaa tac ggc caa gtc ggt gat gct aac gtc ttc tac tgg  460
Gln Leu Gln Glu Tyr Gly Gln Val Gly Asp Ala Asn Val Phe Tyr Trp
        90                  95                  100

ttc caa aac cgt aag tcc cgt agt aaa cac aaa ctc cgc ctc ctc cac  508
Phe Gln Asn Arg Lys Ser Arg Ser Lys His Lys Leu Arg Leu Leu His
    105                 110                 115

aac cac tcc aaa cac tct ctc cct caa acg caa ccg cag ccg cag ccg  556
Asn His Ser Lys His Ser Leu Pro Gln Thr Gln Pro Gln Pro Gln Pro
120                 125                 130                 135

caa cct tcg gct tcc tct tcc tct tcc tcc tcc tct tcc tcc tcc aaa  604
Gln Pro Ser Ala Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Lys
                140                 145                 150

tcc acc aaa ccc cga aaa agc aag aac aag aac aac act aat ctc tct  652
Ser Thr Lys Pro Arg Lys Ser Lys Asn Lys Asn Asn Thr Asn Leu Ser
            155                 160                 165

ttg ggt ggt agt caa atg atg ggg atg ttt cca ccg gaa ccg gcg ttt  700
Leu Gly Gly Ser Gln Met Met Gly Met Phe Pro Pro Glu Pro Ala Phe
        170                 175                 180

ctc ttc ccg gtc tcc act gtc gga ggg ttt gaa ggt atc acc gtc tca  748
Leu Phe Pro Val Ser Thr Val Gly Gly Phe Glu Gly Ile Thr Val Ser
    185                 190                 195

tcc caa tta ggg ttt ctc tcc ggt gat atg att gag caa caa aaa ccg  796
Ser Gln Leu Gly Phe Leu Ser Gly Asp Met Ile Glu Gln Gln Lys Pro
200                 205                 210                 215

gct cca acg tgt acc gga ctc ctg ctg agt gag atc atg aac ggt agt  844
Ala Pro Thr Cys Thr Gly Leu Leu Leu Ser Glu Ile Met Asn Gly Ser
                220                 225                 230

gtg agt tat gga act cat cat caa caa cac ttg agt gag aaa gaa gtt  892
Val Ser Tyr Gly Thr His His Gln Gln His Leu Ser Glu Lys Glu Val
            235                 240                 245

gaa gaa atg agg atg aag atg ttg caa cag cca cag act cag att tgt  940
Glu Glu Met Arg Met Lys Met Leu Gln Gln Pro Gln Thr Gln Ile Cys
        250                 255                 260

tac gct acc act aat cat caa ata gct tct tac aac aac aac aac aac  988
Tyr Ala Thr Thr Asn His Gln Ile Ala Ser Tyr Asn Asn Asn Asn Asn
    265                 270                 275

aac aat aac atc atg ctt cat att cct ccc act act tct act gcc acc  1036
Asn Asn Asn Ile Met Leu His Ile Pro Pro Thr Thr Ser Thr Ala Thr
280                 285                 290                 295

act att act act tcg cat tct ctc gct act gtc cca tca act tcg gac  1084
Thr Ile Thr Thr Ser His Ser Leu Ala Thr Val Pro Ser Thr Ser Asp
                300                 305                 310

cag ctt caa gtt caa gcg gac gca cga ata aga gtt ttc atc aat gaa  1132
Gln Leu Gln Val Gln Ala Asp Ala Arg Ile Arg Val Phe Ile Asn Glu
            315                 320                 325

atg gag ctt gaa gtg agc tca gga ccg ttc aat gtg agg gat gca ttt  1180
Met Glu Leu Glu Val Ser Ser Gly Pro Phe Asn Val Arg Asp Ala Phe
        330                 335                 340

ggg gaa gag gtt gtt ctg att aat tcc gcg ggt cag ccc att gtc acc  1228
Gly Glu Glu Val Val Leu Ile Asn Ser Ala Gly Gln Pro Ile Val Thr
    345                 350                 355

gat gaa tat ggc gtc gct ctt cac cct ctt caa cac gga gcc tcg tac  1276
Asp Glu Tyr Gly Val Ala Leu His Pro Leu Gln His Gly Ala Ser Tyr
360                 365                 370                 375

tat ctg atc tagtcgtgtg ggagatttga gtttgaagaa gaaattaaga  1325
Tyr Leu Ile

cctgtctctt tctttcacca tctactcgta cgtaggctta aatgttaaga ttttataaag  1385
tattggtttc agttacctgt tgtgacggtg tttatgtatg agtttcggac aacattcaca  1445
aaactctctc gttaaattgt tgacctaata atatatgatg tgtgtttcat tattaaaaaa  1505
aaaaaaaaaa aaa  1518

<210> 4 
<211> 378 
<212> PRT 

<213> Arabidopsis thaliana 

<223> Amino acid sequence of protein involved in 

differentiation 

<400> 4 

Met Ala Ser Ser Asn Arg His Trp Pro Ser Met Phe Lys Ser Lys Pro
1               5                   10                  15

His Pro His Gln Trp Gln His Asp Ile Asn Ser Pro Leu Leu Pro Ser
            20                  25                  30

Ala Ser His Arg Ser Ser Pro Phe Ser Ser Gly Cys Glu Val Glu Arg
        35                  40                  45

Ser Pro Glu Pro Lys Pro Arg Trp Asn Pro Lys Pro Glu Gln Ile Arg
    50                  55                  60

Ile Leu Glu Ala Ile Phe Asn Ser Gly Met Val Asn Pro Pro Arg Glu
65                  70                  75                  80

Glu Ile Arg Arg Ile Arg Ala Gln Leu Gln Glu Tyr Gly Gln Val Gly
                85                  90                  95

Asp Ala Asn Val Phe Tyr Trp Phe Gln Asn Arg Lys Ser Arg Ser Lys
            100                 105                 110

His Lys Leu Arg Leu Leu His Asn His Ser Lys His Ser Leu Pro Gln
        115                 120                 125

Thr Gln Pro Gln Pro Gln Pro Gln Pro Ser Ala Ser Ser Ser Ser Ser
    130                 135                 140

Ser Ser Ser Ser Ser Ser Lys Ser Thr Lys Pro Arg Lys Ser Lys Asn
145                 150                 155                 160

Lys Asn Asn Thr Asn Leu Ser Leu Gly Gly Ser Gln Met Met Gly Met
                165                 170                 175

Phe Pro Pro Glu Pro Ala Phe Leu Phe Pro Val Ser Thr Val Gly Gly
            180                 185                 190

Phe Glu Gly Ile Thr Val Ser Ser Gln Leu Gly Phe Leu Ser Gly Asp
        195                 200                 205

Met Ile Glu Gln Gln Lys Pro Ala Pro Thr Cys Thr Gly Leu Leu Leu
    210                 215                 220

Ser Glu Ile Met Asn Gly Ser Val Ser Tyr Gly Thr His His Gln Gln
225                 230                 235                 240

His Leu Ser Glu Lys Glu Val Glu Glu Met Arg Met Lys Met Leu Gln
                245                 250                 255

Gln Pro Gln Thr Gln Ile Cys Tyr Ala Thr Thr Asn His Gln Ile Ala
            260                 265                 270

Ser Tyr Asn Asn Asn Asn Asn Asn Asn Asn Ile Met Leu His Ile Pro
        275                 280                 285

Pro Thr Thr Ser Thr Ala Thr Thr Ile Thr Thr Ser His Ser Leu Ala
    290                 295                 300

Thr Val Pro Ser Thr Ser Asp Gln Leu Gln Val Gln Ala Asp Ala Arg
305                 310                 315                 320

Ile Arg Val Phe Ile Asn Glu Met Glu Leu Glu Val Ser Ser Gly Pro
                325                 330                 335

Phe Asn Val Arg Asp Ala Phe Gly Glu Glu Val Val Leu Ile Asn Ser
            340                 345                 350

Ala Gly Gln Pro Ile Val Thr Asp Glu Tyr Gly Val Ala Leu His Pro
        355                 360                 365

Leu Gln His Gly Ala Ser Tyr Tyr Leu Ile
    370                 375


<210> 5 

<211> 27 

<212> DNA 

<213> Artificial Sequence

<220>
<221>
<222>

<223> Primer

<400> 5

gaagatctca tcatgtcctc ctcaaac  27

<210> 6

<211> 30

<212> DNA

<213> Artificial Sequence

<220>
<221>
<222>

<223> Primer

<400> 6

cggagctcta aataagataa tagattgcgc  30

<210> 7

<211> 32

<212> DNA

<213> Artificial Sequence

<220>
<221>
<222>

<223> Primer

<400> 7

gctctagaac aatggcttct tcgaatagac ac  32

<210> 8

<211> 32

<212> DNA

<213> Artificial Sequence

<220>
<221>
<222>

<223> Primer

<400> 8

tcccccgggc tgatcagata gtacgaggct cc  32